 voting score


Novelty Detection for Election Fraud: A Case Study with Agent-Based Simulation Data

arXiv.org Artificial Intelligence

In this paper, we propose a robust election simulation model and an independently developed election anomaly detection algorithm that demonstrates the simulation's utility. The simulation generates artificial elections with properties and trends similar to those of real-world elections, while giving users control over, and knowledge of, all the important components of the elections. We generate a clean election results dataset without fraud as well as datasets with varying degrees of fraud. We then measure how well the algorithm detects the level of fraud present. The algorithm determines how similar actual election results are to the results predicted from polling and from a regression model of other regions that have similar demographics. We use k-means to partition electoral regions into clusters such that demographic homogeneity is maximized within clusters. We then use a novelty detection algorithm implemented as a one-class Support Vector Machine, where the clean data is provided in the form of polling predictions and regression predictions. The regression predictions are built from the actual data in such a way that the data supervises itself. We show the effectiveness of both the simulation technique and the machine learning model in identifying fraudulent regions.
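The clustering step mentioned in the abstract can be illustrated with a minimal k-means sketch (Lloyd's algorithm) in plain Python. The abstract does not say which demographic features or which initialization the authors use, so the two-feature region vectors, the `kmeans` helper, and the choice to seed centroids with the first k points are all assumptions made for illustration. The one-class SVM stage is not shown.

```python
import math


def kmeans(points, k, iters=100):
    """Lloyd's algorithm. Seeding with the first k points is an illustrative assumption."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each region joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break  # assignments stopped changing, so the clustering has converged
        labels = new_labels
    return labels, centroids


# Hypothetical regions described by two normalized demographic features.
regions = [
    (0.10, 0.20), (0.15, 0.22), (0.80, 0.90),
    (0.85, 0.88), (0.12, 0.18), (0.82, 0.95),
]
labels, centroids = kmeans(regions, k=2)
print(labels)  # [0, 0, 1, 1, 0, 1]
```

Regions that share a label would then serve as the "similar demographics" peers from which a regression prediction could be built.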


FLock: Defending Malicious Behaviors in Federated Learning with Blockchain

arXiv.org Artificial Intelligence

Federated learning (FL) is a promising way to allow multiple data owners (clients) to collaboratively train machine learning models without compromising data privacy. Yet, existing FL solutions usually rely on a centralized aggregator for model weight aggregation, while assuming clients are honest. Even if data privacy can still be preserved, the problems of single-point failure and data poisoning attacks from malicious clients remain unresolved. To tackle this challenge, we propose to use distributed ledger technology (DLT) to achieve FLock, a secure and reliable decentralized Federated Learning system built on blockchain. To guarantee model quality, we design a novel peer-to-peer (P2P) review and reward/slash mechanism to detect and deter malicious clients, powered by on-chain smart contracts. The reward/slash mechanism, in addition, serves as an incentive for participants to honestly upload and review model parameters in the FLock system. FLock thus improves the performance and the robustness of FL systems in a fully P2P manner.


Symmetrization for Embedding Directed Graphs

arXiv.org Artificial Intelligence

Recently, one has seen a surge of interest in developing such methods including ones for learning such representations for (undirected) graphs (while preserving important properties) (Liang et al. 2018). However, most of the work to date on embedding graphs has targeted undirected networks and very little has focused on the thorny issue of embedding directed networks. In this paper, we instead propose to solve the directed graph embedding problem via a two-stage approach: in the first stage, the graph is symmetrized in one of several possible ways, and in the second stage, the so-obtained symmetrized graph is embedded using any state-of-the-art (undirected) graph embedding algorithm. Note that it is not the objective of this paper to propose a new (undirected) graph embedding algorithm or discuss the strengths and weaknesses of existing ones; all we are saying is that whichever be the suitable graph embedding algorithm, it will fit in the above proposed symmetrization framework. Satuluri et al. proposed various ways (such as Bibliometric and Degree-discounted symmetrization) of symmetrizing a directed graph into an undirected graph, while information about directionality is incorporated via weights on the edges of the transformed graph (or applying a re-weighting scheme in case of already weighted graphs) (Satuluri and Parthasarathy 2011).
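As a small illustration of the first stage, the sketch below applies Bibliometric symmetrization, commonly written as U = A·Aᵀ + Aᵀ·A for a directed adjacency matrix A, to a toy graph. The helper names and the four-node example graph are hypothetical. The Degree-discounted variant, which additionally rescales by node degrees, is not shown.

```python
def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    b_cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in b_cols] for row in a]


def bibliometric_symmetrize(adj):
    """U = A A^T + A^T A: weights count shared out-links plus shared in-links."""
    adj_t = transpose(adj)
    coupling = matmul(adj, adj_t)    # A A^T: nodes pointing to the same targets
    cocitation = matmul(adj_t, adj)  # A^T A: nodes pointed to by the same sources
    return [[c + d for c, d in zip(r1, r2)] for r1, r2 in zip(coupling, cocitation)]


# Toy directed graph with edges 0->2, 1->2, 2->3 (illustrative only).
A = [
    [0, 0, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
U = bibliometric_symmetrize(A)
print(U)  # [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 3, 0], [0, 0, 0, 1]]
```

Nodes 0 and 1 share no directed edge, yet they receive an undirected edge of weight 1 because both point to node 2. The resulting symmetric matrix could then be handed to any undirected embedding method.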


Leveraging Gaussian Process and Voting-Empowered Many-Objective Evaluation for Fault Identification

arXiv.org Machine Learning

Using piezoelectric impedance/admittance sensing for structural health monitoring is promising, owing to the simplicity in circuitry design as well as the high-frequency interrogation capability. The actual identification of fault location and severity using impedance/admittance measurements, nevertheless, remains an extremely challenging task. A first-principle based structural model using finite element discretization requires high dimensionality to characterize the high-frequency response. As such, direct inversion using the sensitivity matrix usually yields an under-determined problem. Alternatively, the identification problem may be cast into an optimization framework in which fault parameters are identified through repeated forward finite element analysis, which, however, is often computationally prohibitive. This paper presents an efficient data-assisted optimization approach for fault identification that does not run the finite element model iteratively. We formulate a many-objective optimization problem to identify fault parameters, where response surfaces of impedance measurements are constructed through Gaussian process-based calibration. To balance solution diversity and convergence, an ε-dominance enabled many-objective simulated annealing algorithm is established. As multiple solutions are expected, a voting score calculation procedure is developed to further identify those solutions that yield better implications regarding structural health condition. The effectiveness of the proposed approach is demonstrated by systematic numerical and experimental case studies.
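To make the dominance relation concrete, here is a minimal sketch of additive ε-dominance for minimization objectives, together with a simple archive filter. The abstract does not state which ε-dominance variant the authors use or how it is coupled with simulated annealing, so the additive form, the function names, and the sample objective vectors are assumptions for illustration only.

```python
def epsilon_dominates(a, b, eps):
    """Additive epsilon-dominance (minimization): a, relaxed by eps, is no worse than b everywhere."""
    return all(x - eps <= y for x, y in zip(a, b))


def epsilon_archive(solutions, eps):
    """Keep only solutions that no archived solution epsilon-dominates."""
    archive = []
    for s in solutions:
        if any(epsilon_dominates(kept, s, eps) for kept in archive):
            continue  # s adds nothing beyond the eps tolerance
        # Drop archived solutions that the newcomer epsilon-dominates.
        archive = [kept for kept in archive if not epsilon_dominates(s, kept, eps)]
        archive.append(s)
    return archive


# Hypothetical two-objective values (both to be minimized).
candidates = [(1.0, 4.0), (1.05, 3.98), (2.0, 2.0), (4.0, 1.0), (3.0, 3.0)]
archive = epsilon_archive(candidates, eps=0.1)
print(archive)  # [(1.0, 4.0), (2.0, 2.0), (4.0, 1.0)]
```

A larger `eps` thins near-duplicate solutions more aggressively, which is one way such a relation can trade diversity against convergence.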


ColdRoute: Effective Routing of Cold Questions in Stack Exchange Sites

arXiv.org Artificial Intelligence

Abstract: Routing questions in Community Question Answer services (CQAs) such as Stack Exchange sites is a well-studied problem. Yet cold start, a phenomenon observed when a new question is posted, is not well addressed by existing approaches. Additionally, cold questions posted by new askers present significant challenges to state-of-the-art approaches. We propose ColdRoute to address these challenges. ColdRoute is able to handle the task of routing cold questions posted by new or existing askers to matching experts. Specifically, we use Factorization Machines on the one-hot encoding of critical features such as question tags and compare our approach to well-studied techniques such as CQARank and semantic matching (LDA, BoW, and Doc2Vec). Using data from eight Stack Exchange sites, we are able to improve upon the routing metrics (Precision@1, Accuracy, MRR) over the state-of-the-art models such as semantic matching by 159.5%, 31.84%, [...]

Keywords: question routing · expert finding · cold-start problem · question answering services

1 Introduction

Nowadays, Community-based question answering sites (CQAs) such as Stack Overflow, Stack Exchange sites, and Quora, which enable people to post questions and answers in various domains [Yang et al., 2013], have accumulated millions [...]

Aniket Chakrabarti, Microsoft (work done while at The Ohio State University). Email: chakrabarti.14@osu.edu

One important task in CQAs is to make recommendations for new questions (routing questions), that fall in three scenarios: 1) find experts. In this paper, we focus on the problem of expert finding [Xu et al., 2012, Zhao et al., 2013, Yang et al., 2013, Fang et al., 2016, Zhao et al., 2016, Zhao et al., 2017], which is to choose the right experts for answering questions posted by users in Stack Exchange, a network of question-and-answer (Q&A) websites containing topics in various fields.
Each Stack Exchange site covers a specific topic. Usually there are two types of questions in CQAs: resolved questions (questions with answers) and newly posted questions (questions that have not received any answers).
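As a rough illustration of scoring a (question, expert) pair with a Factorization Machine over one-hot features, the sketch below evaluates the standard second-order FM prediction, w0 + Σ w_i x_i + Σ_{i<j} ⟨v_i, v_j⟩ x_i x_j, using the usual linear-time rewrite of the pairwise term. The feature vocabulary, the weights, and the helper names are all made up for the example. In practice the parameters would be learned from answered questions, and nothing here reproduces ColdRoute's actual feature set or training procedure.

```python
def one_hot(active, vocabulary):
    """Encode the set of active feature names as a 0/1 vector over the vocabulary."""
    return [1.0 if name in active else 0.0 for name in vocabulary]


def fm_predict(x, w0, w, v):
    """Second-order FM score; the pairwise term uses the O(k * n) identity."""
    linear = sum(wi * xi for wi, xi in zip(w, x))
    pairwise = 0.0
    for f in range(len(v[0])):
        s = sum(v[i][f] * x[i] for i in range(len(x)))
        s_sq = sum((v[i][f] * x[i]) ** 2 for i in range(len(x)))
        pairwise += 0.5 * (s * s - s_sq)
    return w0 + linear + pairwise


# Hypothetical vocabulary: question tags plus candidate experts.
vocabulary = ["tag:python", "tag:pandas", "expert:alice", "expert:bob"]

# Made-up parameters (not learned): global bias, linear weights, 2-d latent factors.
w0 = 0.1
w = [0.2, 0.0, 0.3, -0.1]
v = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [-0.5, 0.2]]

# Score a new question tagged "python" against each candidate expert.
score_alice = fm_predict(one_hot({"tag:python", "expert:alice"}, vocabulary), w0, w, v)
score_bob = fm_predict(one_hot({"tag:python", "expert:bob"}, vocabulary), w0, w, v)
best_expert = "alice" if score_alice > score_bob else "bob"
print(best_expert)  # alice
```

Because the score depends only on features such as tags, a question with no answer history can still be ranked against experts, which is the property that makes this model family a natural fit for cold questions.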